Energy controlled insertion of polar molecules in dense fluids 
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We present a method to search low energy configurations of polar molecules in the complex poten- 
tial energy surfaces associated with dense fluids. The search is done in the configurational space of 
the translational and rotational degrees of freedom of the molecule, combining steepest-descent and 
Newton- Raphson steps which embed information on the average sizes of the potential energy wells 
obtained from prior inspection of the liquid structure. We perform a molecular dynamics simula- 
tion of a liquid water shell which demonstrates that the method enables fast and energy-controlled 
water molecule insertion in aqueous environments. The algorithm finds low energy configurations 
of incoming water molecules around three orders of magnitude faster than direct random insertion. 
This method is an important step towards dynamic simulations of open systems and it may also 
prove useful for energy-biased ensemble average calculations of the chemical potential. 
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Many processes of physical, chemical and biological 
interest involve open systems which exchange matter 
with their surroundings. Molecular dynamics (MD) and 
Monte Carlo (MC) simulations of these systems often 
require a method for molecule insertion and, therefore, 
a method for searching configurations with prescribed 
(low) potential energy. Indeed, a randomly placed 
molecule is likely to overlap with pre-existing atoms, re- 
leasing into the system a very high amount of energy. 

The most natural setting for these systems is the grand 
canonical (GC) ensemble. Several methods for GC sim- 
ulations require the location of energy cavities for inser- 
tion (such as cavity-biased methods for GCMGiSi 3 .) or 
careful control of the solvent insertion energy in the case 
of GCMD^. Mass, momentum and energy transfer are 
also a key feature of a class of hybrid methods for non- 
equilibrium simulations which couple an open MD region 
with an interfacing continuum- fluid-dynamics domain 6,7 . 
Open boundaries in such hybrid schemes can avoid finite 
size effects in small MD simulation boxes^, thereby saving 
on computational time. These sort of open boundaries 
could also be used to improve the closed "water shells" 
widely used to hydrate restricted subdomains^ in many 
MD simulations of biological systems. 

Water insertion is also particularly important in pro- 
tein simulations. For instance, it is possible to study pro- 
tein unfolding via gradual water insertion in the protein's 
cavitiesi2*ii. On the other hand, water molecules buried 
in protein cavities at very low energies are essential for 
protein structure and functioniSii 3 ^. Indeed, some tools 
for MD simulations (such as DOWSEfti^) are specialised 
for water insertion in hydrophilic cavities, leaving empty 
however the larger hydrophobic cavities which frequently 
contain stable yet disordered water molecules relevant to 
protein functioni 3 ^^. 

Several methods for the calculation of ensemble aver- 
ages require sampling the potential energy released to the 
system upon insertion of a test moleculei^i 6 ^ 7 ^. Exam- 
ples include calculation of the chemical potential, hydra- 



tion energies and pair distribution functions^. The ap- 
plicability of these methods can be expanded to dense flu- 
ids using techniques that bias the sampling towards low 
energy configurations. Some of these techniques, such as 
cavity-biased^SiSi or excluded volume mapSS sampling, 
are however hampered by the considerable amount of 
time needed to find "cavities" where the test molecule 
could be inserted without overlapping with others. In 
fact, these cavities are just proxies to search low energy 
configurations which could better be identified by an en- 
ergy controlled insertion method. 

The algorithms for water insertion proposed in the lit- 
erature usually involve rather lengthy steps which com- 
prise three separate parts: location of a suitable "cavity" , 
normally using an expensive grid search with O(10 6 ) dif- 
ferent cells«*2i; random insertion in the cavity, followed 
by a large number of energy minimisation steps (either of 
the inserted molecul oV 2 or of the entire systemiS) and, 
finally, thermostatting the whole system over a one to ten 
picoseconds period to extract the extra energy released 
upon insertion. In this article, we present a method to 
locate low energy configurations of dense liquids that al- 
lows insertion of solvent molecules on-the-fly: avoiding 
expensive grid search, non-local energy minimisation and 
thermostatting steps. 

On the potential energy surface, low energies are lo- 
cated inside energy wells whose local minima span a rel- 
atively large range of energy values. The main idea of 
the present method is to reconstruct the energy land- 
scape with a limited number of probes by constraining 
the search to be inside the energy wells. In fact, any ex- 
cursion outside the explored well implies the loss of all the 
information accumulated on the current well which is ef- 
fectively equivalent to a random restart. Efficiency is ob- 
tained by minimising both the number of probes needed 
to determine if the target energy is found within the 
well and the number of explored wells per successful in- 
sertion. The present minimisation algorithm generalises 
non-trivially to multiple degrees of freedom the usher al- 
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gorithm for insertion of Lennard- Jones atomsS 3 ". It shares 
with some other global minimisation methods the recipe 
of applying in turns random moves and local energy 
minimisation^iSiiiSi. However, it is distinguished from 
these others in the way the minimisation is performed via 
a combined steepest-descent and Newton-Raphson iter- 
ator which is tailored adaptively to the structure of the 
potential energy landscape being searched. 

The method uses local information on the gradient 
and the average size of the potential wells, which are 
dependent on the molecule's location and the thermody- 
namic state respectively. The input parameters specify 
the maximum distance AR and rotation angle A0 that 
the incoming molecule can jump without exiting the cur- 
rent well together with a measure of the roughness of the 
potential energy surface AEr. The insertion algorithm 
starts by selecting a random location for the centre of 
mass of the molecule and placing the atoms at the equi- 
librium bond and angle positions in a random orienta- 
tion. The non-bonded potential energy of an incoming 
molecule is given by 

U = \j2 v ^) + \j2 v c(n 3 ), (l) 

where Vlj and Vc are the Lennard- Jones and Coulomb 
pair potentials respectively^ and the index i runs over 
the atoms of the molecule and j over all other atoms, 
which remain fixed while inserting. The energy E = 
2U released to the system upon insertion is computed 
and compared with the target energy Et- The insertion 
succeeds once the energy difference AE = E — Et is 
less than a certain prescribed tolerance set here at 10 -3 
Kcal/mol. 

It is likely that for the random starting configuration 
AE will be a large positive value because there is a high 
chance that the inserted molecule will overlap with oth- 
ers. Then, the force F = Fj applied to the centre 
of mass r cm and torque r = J2i v cm,i x Fj are used 
to compute the next displacement and rotation. Here, 
the index i runs over the atoms of the inserted molecule 
and r cm ,i = Vi — r cm . The molecule is translated by 
5r = mm(AE/F,AR) where F is the magnitude of the 
force on the centre of mass and AR is the maximum 
displacement. With the reference system fixed to the 
molecule, we then compute the rotation angle around 
the torque axis 58 = mm(AE/r,AQ) and rotate the 
molecule around the centre of mass. The resulting up- 
date rule is finally given by 

"cm 

r n+l _ r n , g 

cm cm 1 p n ' 

where 1Z is the rotation matrix around the axis of torque 
of angle 58. This is equivalent to a first order steep- 
est descent procedure for large energy differences and a 
second order Newton method for energy close to the tar- 
get energjiS 3 .. The angular minimisation is stopped when 
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FIG. 1: Contour plot of the potential energy landscape in 
Kcal/mol: (a) for translation relative to the axes i and j fixed 
to the water molecule; (b) for a rotation 8 about the axis j 
and 4> about the axis k for an equilibrated periodic liquid wa- 
ter system at 300K and density 0.96 g/cm 3 . The maximum 
translational displacement AR — 1.0 A and maximum rota- 
tional angles A0 = A$ = 45° are indicated by double-headed 
arrows. For visual convenience angles smaller than —90° and 
larger than 90° in 9 are plotted although being redundant. 

the angle 58 is less than 1° to avoid oscillations due to 
the coupling of rotational and translational degrees of 
freedom. If during the iterations AE increases by more 
than AEr then the current attempt is abandoned and a 
new random configuration is generated. This provides a 
threshold to control the amount of time spent searching 
in the well and the number of wells explored. 

The insertion algorithm in Eq. (J2J does not require 
a baroque implementation and indeed can be easily in- 
cluded in any molecular dynamics program. The code 
used here is based on the serial version of a well estab- 
lished parallel molecular dynamics code NAMD with the 
Charmm27 force fieldSZ-, but it has been designed to in- 
terface easily with any other serial or parallel MD code. 
The search algorithm applies in general to small polar 
molecules but given its importance we focus on controlled 
insertion of water molecules in aqueous environments. 
We use the TIP3P model for water, widely utilised in 
biological simulations 28 . This water model is based on 
three interaction sites, bonds (O-H) and angle (H-O-H) 
being constrained rigidly or, in its flexible version (used 
here) , by a harmonic potential with equilibrium configu- 
rations of 0.96 A and 104.52° respectively. 

As stated, the restriction on the maximum displace- 
ment and rotation has the effect of limiting the search 
to the current potential well. For water, the maximum 
displacement can be extracted from the oxygen-hydrogen 
pair distribution function gon^- We found that an op- 
timum value for the maximum displacement AR = 1 A 
is half of the first peak in goH which is around 2 A. Ex- 
ploring the potential energy landscape provides another 
simple way of obtaining the input parameters. In Fig. 
we show a cross-section of the potential energy surface for 
a displacement of up to 5 A around an equilibrated water 
molecule in the direction of the axes i and j. The unit 
vectors i, j, k form a reference system fixed rigidly to the 
water molecule with the axis i being in the direction of 
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the dipole. As shown in Fig. the optimum value of AR 
is approximately the radius of the potential energy well, 
corroborating information furnished from the pair distri- 
bution function. It is more difficult to obtain structural 
information for the angular degrees of freedom. However, 
a simple inspection of Fig. ^> provides a gross estimate of 
potential energy wells in the rotational degrees of freedom 
as being between 90 — 100° wide; therefore the maximum 
rotation can be fixed at AG = 45°. The value of AEr, 
which sets the maximum uphill energy jump allowed in 
one move, is important to reduce the number of unsuc- 
cessful wells explored. We found that an optimal value 
is near AEr = 3 Kcal/mol. 

It is well known that the local structure of liquid wa- 
ter at equilibrium consists of a hydrogen bond network 
formed by oxygen and hydrogen atoms from neighbour- 
ing water molecules. This structure makes it very hard 
for an incoming water molecule to find low energy con- 
figurations by forming hydrogen bonds with pre-existing 
molecules. However, the insertion algorithm needs only 
to control the thermodynamics by inputting into the sys- 
tem a specified amount of energy which depends on the 
ensemble considered. We performed an MD simulation of 
bulk water using a simple spherical water shell to show 
that it is possible to insert water molecules on-the-fly 
while precisely controlling the energy released to the sys- 
tem. In a previous work<2£ considering Lennard-Jones 
atoms, it was shown that this procedure ensures thermo- 
dynamic consistency after a relaxation time of the order 
of the collision time. We set up an equilibrated TIP3P 
bulk water system within a sphere of radius 37.5 A at 
300K and a pressure of 1 atm. The simulations were run 
with a 12 A cutoff radius and without corrections to the 
long ranged electrostatic forces^ 7 .. The water molecules 
in the outer shell of length d — 12.5 A play the role of a 
reservoir confined in the sphere by a simple constant ra- 
dial force field specified by an acceleration g acting only 
within the outer shell. The effect of this force is a linear 
decay of the pressure in the water shell according to the 
usual formula for the hydrostatic pressure in an incom- 
pressible fluid Pi = P — pgd, where Pi is the pressure at 
the surface of the water sphere and Pq is the pressure of 
the bulk that we want to maintain. We impose Pi = 
by setting g = P /(pd). 

In the present set up, the flow rate of molecules to 
the inner shell is controlled by the applied pressure force, 
while the number of reservoir molecules in the outer shell 
is fixed at the bulk density. This implies that molecules 
which, due to fluctuations or sudden pressure waves, 
move outside the sphere are removed and reinserted using 
the insertion method at a random location in the outer 
shell, with a velocity given by the Maxwell-Boltzmann 
distribution at 300K. We note that the present setting 
can be generalised to avoid finite-size effects due to pe- 
riodic boundary conditions in a hydrodynamically con- 
sistent wayi The total energy of the system can be 
fixed by setting the amount of energy released upon inser- 
tion equal to the energy lost when a molecule moves out 
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FIG. 2: Number of energy evaluations per molecule required 
to insert a water molecule while releasing an energy less than 
E (Kcal/mol) to the system. The proposed insertion algo- 
rithm (crosses) is around three orders of magnitude faster 
than random insertion (circles) at low energies. The his- 
togram for random insertion is computed from 10 7 trials. 



through the open boundary^. On average, the exchanged 
potential energy per molecule is equal to the mean en- 
ergy per molecule: by inserting at this energy target we 
kept the total energy under control (without drift) with 
no thermostat at all. In other situations, such as at con- 
stant temperature, it is sufficient to release a moderately 
greater energy, for example equal to the excess chemical 
potential, which can be thermalized dynamically by the 
thermostat. 

An estimate of the efficiency of this insertion method 
can be obtained by determining the average number of 
energy evaluations, including failed well searches, needed 
to insert a single water molecule at the specified energy. 
Each iteration of the insertion algorithm corresponds to 
one energy evaluation on the solvent molecule, which is 
a three atom-force calculation for TIP3P water. In par- 
ticular, it takes an average of 206 iterations, exploring 34 
wells, to insert at the reference energy of the mean energy 
per molecule (—11 Kcal/mol), and 36 iterations (only 6 
wells) at the energy of the excess chemical potential (- 
5.8 Kcal/mol, calculated using the Bennett methodic). 
We note that the computational cost required by the 
insertion method in a typical MD simulation is quite 
small. For instance, in the simulation of the open water 
shell mentioned above, incoming water molecules were 
inserted at a target energy of Et = — llKcal/mol within 
a volume of 155. 4nm 3 at a rate of 141 per picosecond. 
The amount of CPU time devoted to insertion was only 
3% of the grand total of the simulation. 

Interestingly, the mean number of iterations to explore 
a well which leads to the correct target energy is only 
around 12, independent of the target energy. The method 
may be improved further by reducing the total number of 
searched wells but it is already optimal in the sense that 
the number of iterations to explore a single well does 
not depend on the target energy. Future applications 
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may require searching many more degrees of freedom, 
e.g. conformational searches, for which it is impractical 
to fix each maximum displacement a priori. In this case, 
it would be useful to set up an adaptive rule to infer the 
input parameters from the efficacy of the search itself. 

It is useful to compare our insertion algorithm with a 
direct random insertion. To this end, the probability dis- 
tribution f(E) of releasing a total energy E upon random 
insertion was estimated by computing a histogram from 
10 7 random insertion trials. The number of trials re- 
quired to obtain an energy smaller than E is given by the 
reciprocal of the cumulative distribution 1/F(E) where 

F(E) = f(E')dE'. This number is compared with 
the number of iterations (energy evaluations) required by 
the insertion algorithm in Fig. (0). The insertion algo- 
rithm is around three orders of magnitude faster than 
a random insertion for energies lower than the chemical 
potential and so may provide an efficient alternative to bi- 
ased methods, such as cavity-biased sampling 20,21 , to re- 
construct the probability distribution f(E). Indeed, the 
present algorithm enables one to identify the important 
low energy regions very accurately where an un-biased 
sampling can be performed. This appealing approach 
enables fast computation of the chemical potential from 
the probability distribution f(E) at low energies^. 
In summary, we have reported a new method for the 



insertion of polar molecules in dense fluids by a gener- 
alisation of the USHER protocol 2 ^. The energy minimi- 
sation is applied concurrently to all degrees of freedom 
(translational and rotational for water) and is indepen- 
dent of the specific potential used. Indeed, the method 
is even more general. It may be applied to other prob- 
lems related to conformational searches and minima of 
potential energy surfaces with many more degrees free- 
dom. Given its importance for computational biology, 
we focused on water and demonstrated that it is possible 
to efficiently insert water molecules in aqueous environ- 
ments while controlling the thermodynamic state. This 
task is commonly considered to be very time consuming, 
but we are able to achieve it at negligible computational 
cost thanks to a very efficient configurational search al- 
gorithm. The present algorithm is an essential tool for 
performing hybrid MD-continuum simulations 6 8 of bio- 
logical interest. Indeed, it represents an important step 
towards a general method for performing MD simulations 
of open systems, for which a dynamic calculation of the 
chemical potentialiS*2£ could be used to control the inser- 
tion rate so as to maintain constant the solvent chemical 
potential. 
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